Incrementally Updating the SMT Reordering Model
Abstract
This work is concerned with incrementally training statistical machine translation (SMT) models when new data becomes available, in contrast to re-training new models on the entire accumulated data. Incremental training allows faster, more frequent model updates, keeping the SMT system up to date with the most recent data. Specifically, we address incrementally updating the reordering model (RM), a component of phrase-based machine translation that models changes in phrase order between the source and target languages, and for which incremental training has not been proposed so far. First, we show that updating the reordering model helps improve translation quality. Second, we present an algorithm for updating the reordering model within the popular Moses SMT system. Our method produces exactly the same model as training from scratch, but does so much faster.
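The claim that an incremental update can reproduce the from-scratch model exactly can be illustrated with a small sketch. It is not the paper's algorithm and does not use any Moses API: it assumes, purely for illustration, that the reordering model is estimated from per-phrase-pair orientation counts, and the class `OrientationCounts`, the orientation labels, and the add-alpha smoothing are all hypothetical choices. Under that assumption, adding the counts from new data to stored counts yields the same probabilities as recounting everything.

```python
from collections import Counter, defaultdict

# Hypothetical orientation labels; a real system may use a different set.
ORIENTATIONS = ("mono", "swap", "other")


class OrientationCounts:
    """Illustrative store of orientation counts per phrase pair (an assumption)."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def add(self, events):
        # events: iterable of ((source_phrase, target_phrase), orientation)
        for pair, orientation in events:
            self.counts[pair][orientation] += 1

    def probabilities(self, pair, alpha=0.5):
        # Add-alpha smoothing is an assumed choice, used only for this sketch.
        c = self.counts.get(pair, Counter())
        total = sum(c[o] for o in ORIENTATIONS) + alpha * len(ORIENTATIONS)
        return {o: (c[o] + alpha) / total for o in ORIENTATIONS}


# Toy orientation events extracted from "old" and "new" training data.
old_events = [
    (("das haus", "the house"), "mono"),
    (("das haus", "the house"), "mono"),
    (("haus", "house"), "swap"),
]
new_events = [
    (("das haus", "the house"), "swap"),
]

# Incremental update: keep the old counts and add only the new events.
incremental = OrientationCounts()
incremental.add(old_events)
incremental.add(new_events)

# From scratch: recount over the entire accumulated data.
scratch = OrientationCounts()
scratch.add(old_events + new_events)

pair = ("das haus", "the house")
print(incremental.probabilities(pair) == scratch.probabilities(pair))
```

Because the counts are integers and addition is order-independent, both paths end with identical counts and therefore identical probabilities; the incremental path only has to process the new events.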
Similar resources
Phrase Reordering Model Integrating Syntactic Knowledge for SMT
Reordering models are important for statistical machine translation (SMT). Current phrase-based SMT technologies are good at capturing local reordering but not global reordering. This paper introduces syntactic knowledge to improve the global reordering capability of an SMT system. Syntactic knowledge such as boundary words, POS information and dependencies is used to guide phrase reordering. Not on...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
An Ngram-based reordering model
This paper describes in detail a novel approach to the reordering challenge in statistical machine translation (SMT). This Ngram-based reordering (NbR) approach uses the powerful techniques of SMT systems to generate a weighted reordering graph. Thus, statistical criteria reordering constraints are supplied to an SMT system, and this allows an extension to the SMT decoding search. The NbR appro...
Dependency-based Reordering Model for Constituent Pairs in Hierarchical SMT
We propose a novel dependency-based reordering model for hierarchical SMT that predicts the translation order of two types of pairs of constituents of the source tree: head-dependent and dependent-dependent. Our model uses the dependency structure of the source sentence to capture the medium- and long-distance reorderings between these pairs of constituents. We describe our reordering model in de...
Estimating Word Alignment Quality for SMT Reordering Tasks
Previous studies of the effect of word alignment on translation quality in SMT generally explore link level metrics only and mostly do not show any clear connections between alignment and SMT quality. In this paper, we specifically investigate the impact of word alignment on two pre-reordering tasks in translation, using a wider range of quality indicators than previously done. Experiments on G...
Publication date: 2014